12.06.2019

Introduction

Results of our little survey:

Name    Field          Research
Henrik  Bibliometrics  Statistics, coding, prose

Topics du jour:

  • The modern research cycle
  • The Open Science movement
  • Incorporating workflow thinking into your research

Part I: The modern research project

An idealised research project

As a researcher, this is in many ways how you would want a research project to look:

Requirements

…but then, you are not the one with the money (yet).

  • Data management plan
  • Publication plan
  • Dissemination plan

Data management plan

Research data should be shared and reused more widely […] Better access to research data can boost innovation and value creation by enabling actors outside the research community to find new areas of application.

National strategy on access to and sharing of research data

  • What do you collect?
  • How do you treat it?
  • How will you keep/share it?

Publication plan

  • Where do you plan to publish?
  • What part of the project will make it into which publications?

The publishing cycle

The publishing cycle, really

Dissemination plan

  • How will you present your research?
  • In which channels?

Social media

A more realistic project plan

Is this you?

Why all this stuff?

[W]e have two major points to consider. First, due to a lack of adequate incentives in the reward structure of professional science […] actual replication attempts are rarely carried out. Second, to the extent that they are carried out, it can be well-nigh impossible to say conclusively what they mean, whether they are “successful” (i.e., showing similar, or apparently similar, results to the original experiment) or “unsuccessful” (i.e., showing different, or apparently different, results to the original experiment).

Earp, B. and D. Trafimow (2015) Replication, falsification, and the crisis of confidence in social psychology. Frontiers in Psychology

This is the whole abstract of an interesting paper in the field of genomic biology:

The spreadsheet software Microsoft Excel, when used with default settings, is known to convert gene names to dates and floating-point numbers. A programmatic scan of leading genomics journals reveals that approximately one-fifth of papers with supplementary Excel gene lists contain erroneous gene name conversions.

Ziemann, M., Y. Eren, A. El-Osta (2016) Gene name errors are widespread in the scientific literature. Genome Biology 17:177
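A check of this kind is easy to sketch in Python. The patterns and the function name below are illustrative assumptions, not the scan Ziemann et al. actually ran; they only flag entries that look like dates (such as "2-Sep") or floating-point numbers (such as "2.31E+13") in a list of gene names.

```python
import re

# Hypothetical patterns for values that look like Excel conversions.
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
DATE_LIKE = re.compile(rf"^\d{{1,2}}-({MONTHS})$", re.IGNORECASE)
FLOAT_LIKE = re.compile(r"^\d+(\.\d+)?E\+?\d+$", re.IGNORECASE)


def suspicious_gene_names(names):
    """Return the entries that look like dates or floating-point numbers."""
    flagged = []
    for name in names:
        value = name.strip()
        if DATE_LIKE.match(value) or FLOAT_LIKE.match(value):
            flagged.append(name)
    return flagged


# Made-up example list: three proper-looking symbols would pass,
# the date-like and number-like entries are flagged.
genes = ["TP53", "2-Sep", "BRCA1", "1-Mar", "2.31E+13"]
print(suspicious_gene_names(genes))
```

Running this on a supplementary gene list before submission would at least show whether the spreadsheet has silently rewritten some names.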

Storytime

Here are some rows of some of the columns:

s4  s6  s7  s8  s9
4   4   1   NA  46
3   1   1   NA  125
3   1   1   NA  90
3   3   1   NA  156
4   5   1   NA  78

  • Only problem: I don’t know where I put the codebook!
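One way to avoid this situation is to generate a codebook skeleton at the moment the data is collected and keep it next to the data file. The following Python sketch is only an illustration: the function name and the comma-separated layout are assumptions, and the descriptions are left blank because the meaning of s4 to s9 is exactly what is missing here.

```python
import csv
import io


def codebook_skeleton(csv_text):
    """Build one codebook entry per column, with the description left blank."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for column in reader.fieldnames:
        values = [row[column] for row in rows]
        entries.append({
            "variable": column,
            "description": "",  # to be filled in by whoever collected the data
            "observed_values": sorted({v for v in values if v != "NA"}),
            "n_missing": values.count("NA"),  # "NA" marks missing values
        })
    return entries


# The rows shown above, written as comma-separated text (an assumed layout).
sample = """s4,s6,s7,s8,s9
4,4,1,NA,46
3,1,1,NA,125
3,1,1,NA,90
3,3,1,NA,156
4,5,1,NA,78
"""

for entry in codebook_skeleton(sample):
    print(entry)
```

The skeleton does not recover what the variables mean; it only makes the gap visible early, while someone still remembers the answer.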

An all-too familiar story

Part II: How to deal with this?

Just don’t do it

Try to integrate the issues into your workflow

Part III: Examples of digital workflows

Collaborating

  • From simple to

Keeping track

Documenting

Sharing

The trade-offs

  • There are powerful, efficient tools at our disposal
  • There is a learning curve of varying steepness
  • Maybe

Resources